Abstract

We show that every almost universal hash function also has the storage enforcement property. Almost universal hash functions have found numerous applications, and we show that this new storage enforcement property allows them to be applied to a wide range of remote verification tasks: (i) Proof of Secure Erasure (where we want to remotely erase and securely update the code of a compromised machine in the presence of a memory-bounded adversary), (ii) Proof of Ownership (where a storage server wants to check whether a client has the data it claims to have before giving it access to deduplicated data) and (iii) Data Possession (where the client wants to verify whether the remote storage server is storing its data). Specifically, the storage enforcement guarantee in the classical data possession problem removes any practical incentive for the storage server to cheat the client by saving on storage space.

The proof of our result relies on a natural combination of Kolmogorov Complexity and List Decoding. To the best of our knowledge this is the first work that combines these two techniques. We believe the newly introduced storage enforcement property of almost universal hash functions will open promising avenues of exciting research under the memory-bounded (bounded storage) adversary model.

Keywords: Kolmogorov Complexity, List Decoding, Almost Universal Hash Family, Data Possession, Proof of Retrievability, Proof of Ownership, Reed-Solomon Codes, CRT Codes.

1 Introduction



Universal hashing was defined in the seminal paper of Carter and Wegman and has since been an integral part of both complexity and algorithms research. In this paper we consider the following well known generalization of universal hashing, called $\varepsilon$-almost universal hashing [28]. In particular, a family $\mathcal{H} = \{h_1, \dots, h_n\}$ of hash functions (where $h_i : [q]^k \to [q]$)¹ is an $\varepsilon$-almost universal hash family if for every $x \neq y \in [q]^k$, we have

$$\Pr_{i \in [n]} [h_i(x) = h_i(y)] \le \varepsilon.$$

($1/q$-almost universal hash families are universal hash families as defined in [6].)
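As a concrete illustration of this definition, the polynomial hash over a prime field can be checked exhaustively for small parameters. The sketch below is only an example under stated assumptions: $q$ is taken to be prime, $[q]$ is identified with $\{0, \dots, q-1\}$, the family has $n = q$ members, and the helper names `poly_hash` and `max_collision_prob` are hypothetical. For this family two distinct inputs collide on at most $k - 1$ indices, so it should be $\varepsilon$-almost universal with $\varepsilon = (k-1)/q$.

```python
from fractions import Fraction
from itertools import product

def poly_hash(i, x, q):
    """h_i(x) = sum_j x[j] * i**j mod q (Horner's rule); q is assumed prime."""
    acc = 0
    for coeff in reversed(x):
        acc = (acc * i + coeff) % q
    return acc

def max_collision_prob(q, k):
    """Exhaustively compute the maximum over x != y of Pr_i[h_i(x) = h_i(y)]."""
    messages = list(product(range(q), repeat=k))
    worst = Fraction(0)
    for a in range(len(messages)):
        for b in range(a + 1, len(messages)):
            x, y = messages[a], messages[b]
            collisions = sum(poly_hash(i, x, q) == poly_hash(i, y, q)
                             for i in range(q))
            worst = max(worst, Fraction(collisions, q))
    return worst

q, k = 5, 3                      # toy parameters, small enough for brute force
eps = max_collision_prob(q, k)   # empirical epsilon of the family
print(eps, "<=", Fraction(k - 1, q))
```

The brute-force search is quadratic in $q^k$, so it is only meant for sanity-checking toy parameters.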

In this paper, we show that any $\varepsilon$-almost universal hash family also has the "storage enforcement" property. In particular, we call a hash function family $\mathcal{H} = \{h_1, \dots, h_n\}$ $(\gamma, f(\cdot))$-storage enforceable if the following holds. A prover claims to have $x \in [q]^k$. The prover is allowed to perform arbitrary computation on $x$ and retain its output $y \in [q]^*$. Then the following holds for any $x \in [q]^k$: if the prover can compute² $h_i(x)$ with probability at least $\gamma$ (for a uniformly random $i \in [n]$), then $|y| \ge f(x)$.

Our main result is the following:

Theorem 1. If a computable hash function family $\mathcal{H} = \{h_1, \dots, h_n\}$ is $\varepsilon$-almost universal, then it is also $(\sqrt{\varepsilon}, f(\cdot))$-storage enforceable, where $f(x) \approx C(x)$ and $C(x)$ is the plain Kolmogorov complexity of $x$.

The storage enforcement property is interesting in its own right, and we present its applications to the problems of proof of secure erasure, proof of ownership and data possession.

Before discussing the applications of storage enforcement, we present the techniques we use to prove Theorem 1, which to the best of our knowledge is the first instance of naturally combining list decoding [9, 31] and Kolmogorov complexity [19, 20] to give interesting results. Both Kolmogorov complexity (see the textbook [20]) and list decoding (see the survey by Sudan [29] and Guruswami's thesis El) have found numerous applications in complexity theory. We believe that this combination merits more exploration and hope that this work will lead to a more systematic study.

Our techniques. We begin our discussion with the following related result. Let $H : [q]^k \to [q]^n$ be an error-correcting code with good list decodability properties (see Section 2.1 for coding definitions). (We do not need an algorithmic guarantee; a combinatorial guarantee suffices.) Consider the natural hash family $\mathcal{H} = \{h_1, \dots, h_n\}$, where $h_i(x) = H(x)_i$. We now argue that $\mathcal{H}$ has the storage enforcement property. In particular, we quickly sketch why the prover cannot get away with storing a vector $y$ such that $|y|$ is smaller than $C(x)$, the plain Kolmogorov complexity of $x$, by some appropriately small additive term. Since we are assuming that the prover uses an algorithm $A_x$ to compute its answer $A_x(\beta, y)$ to the random challenge $\beta \in [n]$, if the prover's answer is correct with probability at least $\gamma$, then the vector $(A_x(\beta, y))_{\beta \in [n]}$ differs from $H(x)$ in at most a $1 - \gamma$ fraction of positions. Thus, if $H$ has good list decodability, then (using $A_x$) one can compute a list $\{x_1, \dots, x_L\}$ that contains $x$. Finally, one can use $\log L$ bits of advice (in addition to $y$) to output $x$. This procedure then becomes a description of $x$, and if $|y|$ is sufficiently smaller than $C(x)$, then our description will have size $< C(x)$, which contradicts the definition of $C(x)$.

The above result then implies Theorem 1 because (1) every $\varepsilon$-almost universal hash family defines a code $H$ (in exactly the same way as we defined the corresponding hash family $\mathcal{H}$ from the code $H$ in the paragraph above) with (relative) distance at least $1 - \varepsilon$; and (2) every code $H$ with relative distance $1 - \varepsilon$ can be (combinatorially) list decoded from a $1 - \sqrt{\varepsilon}$ fraction of errors by the Johnson bound (cf. [14]).

1 Our results can also handle the case where each $h_i : [q]^k \to [q_i]$ for potentially different $q_i$ for each $i \in [n]$. For simplicity, we will concentrate on the case of $q_i = q$ for every $i \in [n]$.

2 We will assume the prover uses an algorithm $A_x$ to compute $A_x(i, y)$ as its version of $h_i(x)$.

From a more practical point of view, the Karp-Rabin hash corresponds to $H$ being the so-called "Chinese Remainder Theorem" (CRT) code, and the polynomial hash corresponds to $H$ being the Reed-Solomon code. Reed-Solomon and CRT codes have good list decodability, which implies that our protocols can be implemented using the classic Karp-Rabin and polynomial hashes.
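The correspondence between the Karp-Rabin hash and the CRT code can be made concrete with a small sketch. It assumes a hypothetical list of six small prime moduli and messages that are integers below the product of the $k$ smallest moduli; the names `PRIMES`, `crt_encode` and `agreements` are illustrative only. Two distinct messages can then agree in at most $k - 1$ positions, since the product of the moduli on which they agree must divide their difference.

```python
from itertools import combinations
from math import prod

PRIMES = [2, 3, 5, 7, 11, 13]   # hypothetical choice of n = 6 small prime moduli

def crt_encode(x, primes=PRIMES):
    """Codeword of the CRT code: position i holds the Karp-Rabin hash x mod p_i."""
    return tuple(x % p for p in primes)

def agreements(u, v):
    """Number of positions in which two codewords agree (hash collisions)."""
    return sum(a == b for a, b in zip(u, v))

k = 3
K = prod(PRIMES[:k])            # messages are the integers in [0, K)
codewords = [crt_encode(x) for x in range(K)]
max_agree = max(agreements(u, v) for u, v in combinations(codewords, 2))

# Collision probability of the hash family = max_agree / n.
print(max_agree, "agreements out of", len(PRIMES), "positions")
```

Here the alphabet size differs from position to position, which is the setting of footnote 1.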

Next, we present three applications of hash functions that have storage enforcement. 

Proof of Secure Erasure. Often, it is important to verify the internal state of a remote embedded device with limited memory, such as a sensor, actuator or computer peripheral. This is to ensure that it is running the intended code and not malicious or arbitrary code. When suspicious behavior of the remote device is detected, it sometimes becomes necessary to erase the entire memory content and then update it with legitimate code. Two crucial components of this process are (i) forcing the remote device to store a random string at least as large as its memory and (ii) efficiently verifying, remotely, that the device has indeed stored the random string. A similar problem has been studied before under the name of Proof of Secure Erasure [21] using cryptographic primitives, which are, in most cases, too expensive for resource-constrained embedded devices. However, almost universal hash functions with the storage enforcement property apply naturally to provide an efficient proof of secure erasure scheme, as outlined below.

The verifier can use any storage enforceable hash family, along with a randomly generated string $x$ (which will, with constant probability greater than 1/2, have $C(x) \ge |x| - O(1)$)³, to force the remote device (the prover) first to store a string of length close to $|x|$ (overwriting any malicious code as long as $|x|$ is chosen large enough), second to correctly answer a verification request (a failure to answer the request proves that the initial erasure has failed), and third (now that there is no room for malicious code) to install the updated code.

Apart from efficiency issues, the cited work [21] also assumes that the remote device stores some portion of its trusted code in a fixed ROM. However, a universal hash function based scheme with storage enforcement is not strictly dependent on such an assumption. We do, however, require (as does [21]) that the remote machine not have network access to external storage, which is a very reasonable assumption in the memory-bounded adversary model.

Proof of Ownership. Consider the case when a client wants to upload its data $x$ to a storage service (such as Dropbox) that uses "deduplication." In particular, if $x$ is already stored on the server, then the server will ask the client not to upload $x$ and will give it ownership of $x$. To save on communication, the server asks the client to send it a hash $h(x)$, and if it matches the hash of the $x$ stored on the server, the client is issued ownership of $x$. As identified by [16], this simple deduplication scheme can be abused for malicious purposes. For example, if a malicious user gets hold of the hashes of the files stored on the server (through a server break-in or unintended leakage), it will get access to those files. The simple deduplication scheme can also be exploited as an unintended content distribution network: a malicious user can upload a potentially large file and share the hash of the file with accomplices, and all these accomplices can then present the hash to the storage server and get access to the large file. A storage enforceable hash function can address such a situation. Specifically, if one has a hash family $\{h_1, \dots, h_n\}$ that is storage enforceable, then the server can pick a random $i \in [n]$ and ask the client to send it $h_i(x)$. If the client succeeds in doing so, then the server knows that the client has a lot of information (close to $C(x)$) related to $x$ and not just a small fingerprint of $x$.

3 The exact bound is very strong: $C(x) \ge |x| - r$ for at least a $(1 - 1/2^r)$-fraction of the strings of length $|x|$.

Proof of ownership is studied formally in [16]. In addition to the scheme being computation intensive, the security guarantee in [16] is based on the min-entropy of the original file, which deals with the probability distribution of input data blocks; we believe (as the authors of [16] also agree) that this is not the best measure for enforcing storage. We do not make any assumptions about what kind of data we are storing. For this reason, Kolmogorov complexity, which is at the core of our solution, is a better option, as it is defined for each specific block of data. Also, the majority of the existing data possession schemes (discussed in the next application) cannot be used to address proof of ownership, because they require the prover (client) to store $x$ in a modified form, which is an infeasible assumption for this application.

Data Possession. This application considers the following problem: A string $x \in [q]^k$ is held by the
client/user, then transmitted to a remote server (or broken up into several pieces and transmitted to several 
remote servers). At some later point, after having sent x, the client would like to know that the remote 
server(s) is storing x correctly. The problem has been studied under the name of proof of retrievability 
(cf. H7J |5j HI) and/or data possession (cf. CD |2l |7] [10]]). Given the greater prevalence of outsourcing of 
storage (e.g. in "cloud computing"), such a verification procedure is crucial for auditing purposes to make 
sure that the remote servers are following the terms of their agreement with the client. 

In contrast with the existing schemes, our approach to the remote server adversarial model is rather practical. We assume that the storage provider is not fully malicious: e.g., it is reasonable to assume that Amazon will not try to tamper with an individual's storage in a malicious way. However, the server will not be completely honest either. We make the natural assumption that the server's main goal is to save on its storage space (since this reduces its costs for storage media, networking and power): so if it can pass the verification protocol by storing substantially less than $|x|$ bits of data, then it will do so.⁴ In our scheme, we strive to make this impossible for the server to do without a high probability of getting caught.

Also, before we move on to our scheme, we would like to make a point that does not seem to have been 
made explicitly before. Note that we cannot prevent a server from compressing their copy of the string, or 
otherwise representing it in any reversible way. (Indeed, the user would not care as long as the server is able 
to recreate x.) This means that a natural upper bound on the amount of storage we can force a server into 
is C(x), the (plain) Kolmogorov complexity of x, which is the size of the smallest (algorithmic) description 
of x. 

Our scheme to address the data possession problem is determined by a $(\gamma, f(\cdot))$-storage enforceable hash family $\{h_1, \dots, h_n\}$. In the pre-processing step, the client picks a random $i \in [n]$, stores $(i, h_i(x))$ and ships off $x$ to the server. During the verification step, the client sends $i$ and asks for $h_i(x)$. Upon receiving the reply $z$ from the server, the client checks whether $z = h_i(x)$. Since the hash family is storage enforceable, if the server passes the verification protocol with probability $\gamma > 0$, then it provably has to store $f(x)$ (which in our results will be $C(x)$ up to a very small additive term) bits of data. A cheating server is allowed to use arbitrarily powerful algorithms (as long as they terminate) while responding to challenges from the user.⁵ Further, every honest server only needs to store $x$. In other words, unlike some existing results, our protocol does not require the server(s) to store a massaged version of $x$. In practice, this is important as we do not want our auditing protocol to interfere with "normal" read operations. However, unlike many of the existing results based upon cryptographic primitives, our protocol can only allow a number of audits that is proportional to the user's local storage.⁶ Further, as a demonstration of the versatility of our technique, it will turn out that our schemes can also provide a proof of retrievability (see e.g. Remark 1), but we do not claim any improvement over existing work on proofs of retrievability, e.g. (H. In general, our single prover scheme is very similar to one of the schemes discussed in [8]. For example, one of the instantiations of our scheme with Reed-Solomon codes in fact exactly resembles one of their schemes. However, there are some crucial differences in the technical contributions of the schemes. First, even though both schemes use list decodability of codes, the schemes differ in how one "prunes" the output of list decoding. The scheme in [8] uses extra hash functions while our work uses Kolmogorov complexity. Second, and perhaps more importantly, we believe that the main contribution of our work is to formally establish the storage enforcement property for every almost universal hash function and to show a range of applications of these storage enforceable hash functions.

4 In other words, the server is willing to spend a lot of computational power to save on its space usage but is not maliciously trying to cheat the client.

5 The server(s) can use different algorithms for different strings $x$, but the algorithm cannot change across different challenges for the same $x$.
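The pre-processing and verification steps of the data possession scheme can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes the polynomial hash over a hypothetical prime modulus `Q = 257`, and the class names `Client`, `HonestServer` and `TruncatingServer` are invented for the example. The client keeps only the pair $(i, h_i(x))$, so each stored pair supports a single audit.

```python
import secrets

Q = 257  # hypothetical prime modulus; [q] is identified with {0, ..., Q-1}

def poly_hash(i, x, q=Q):
    """Polynomial hash h_i(x) = sum_j x[j] * i**j mod q, via Horner's rule."""
    acc = 0
    for symbol in reversed(x):
        acc = (acc * i + symbol) % q
    return acc

class Client:
    """Verifier: keeps only (i, h_i(x)), i.e. about log n + log q bits."""
    def preprocess(self, x):
        self.i = secrets.randbelow(Q)      # random index, here n = Q
        self.tag = poly_hash(self.i, x)    # stored hash value h_i(x)
    def challenge(self):
        return self.i
    def verify(self, z):
        return z == self.tag

class HonestServer:
    """Prover that stores x unmodified and answers any challenge."""
    def __init__(self, x):
        self.x = list(x)
    def respond(self, i):
        return poly_hash(i, self.x)

class TruncatingServer:
    """Cheating prover that keeps only the first half of x."""
    def __init__(self, x):
        self.y = list(x[: len(x) // 2])
    def respond(self, i):
        return poly_hash(i, self.y)

x = [secrets.randbelow(Q) for _ in range(16)]
x[-1] = 1                                  # ensure the discarded half is non-zero
client = Client()
client.preprocess(x)
print(client.verify(HonestServer(x).respond(client.challenge())))  # honest server passes
```

In this toy instance the truncating server answers correctly on at most 15 of the 257 possible challenges, because its answers and the true hash values are two distinct polynomials of degree at most 15.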

Our techniques are very flexible. As mentioned earlier, our protocols also imply a proof of retrievability. As a further demonstration of this flexibility, we show that our results seamlessly generalize to the multiple server case. We show that we can allow for arbitrary collusion among the servers if we are only interested in checking whether at least one server is cheating (without identifying it). If we want to identify at least one cheating server (and allow for servers to not respond to challenges), then we can handle a moderately limited collusion among servers.

Theorem 1, along with the above, then implies that any almost universal hash family yields a valid verification protocol for the data possession problem. Our protocol with the best parameters, which might need the user and honest server(s) to be exponential time algorithms, provably achieves the optimal local storage for the user and the optimal total communication between the user and the server(s). By picking an appropriate almost universal hash family, with slightly non-optimal storage and communication parameters, the user and the honest server(s) can work with single pass logarithmic space data stream algorithms.

2 Preliminaries 

We now formally define the different parameters of a verification protocol based on storage enforceable hash functions. For the secure erasure application, the administrator/user is the verifier and the remote embedded device is the prover. In the proof of ownership problem, the deduplication server is the verifier and the client is the prover. For the data possession problem, the client is the verifier and the storage server is the prover. For clarity, the rest of the discussion will focus on the data possession application, to accommodate both the single prover and multiple prover cases.

Basic Verification Scheme. We use $\mathbb{F}_q$ to denote the finite field with $q$ elements. We also use $[n]$ to denote the set $\{1, 2, \dots, n\}$. Given any string $x \in [q]^*$, we use $|x|$ to denote the length of $x$ in bits. Additionally, all logarithms will be base 2 unless otherwise specified. We use $U$ to denote the verifier.

For the data possession problem, we define the verification protocol in terms of the multiple prover case first and then discuss the single prover case as a special case for ease of discussion. We assume that $U$ wants to store its data $x \in [q]^k$ among $s$ service providers (or provers) $P_1, \dots, P_s$. For the proof of ownership problem, the client (or the prover) $P_1$ has the data $x$. In the preprocessing step, the provers $\mathcal{P} = \{P_1, \dots, P_s\}$ get $x$ divided up equally among the $s$ provers; we will denote the chunk for prover $i \in [s]$ as $x_i \in [q]^{n/s}$.⁷ (For the data possession problem, we can assume that $U$ does this division and sends $x_i$ to $P_i$.) Each prover is then allowed to apply any computable function to its chunk and to store a string $y_i \in [q]^*$. Ideally, we would like $y_i = x_i$. However, since the provers can compress $x$, we would at the very least like to force $|y_i|$ to be as close to $C(x_i)$ as possible. For notational convenience, for any subset $T \subseteq [s]$, we denote by $y_T$ ($x_T$ resp.) the concatenation of the strings $\{y_i\}_{i \in T}$ ($\{x_i\}_{i \in T}$ resp.).

6 An advantage of proving security under cryptographic assumptions is that one can leverage existing theorems to prove additional security properties of the verification protocol. We do not claim any additional security guarantees other than the ability to force servers to store close to $C(x)$ bits of data.

To enforce the conditions above, we design a protocol. We will be primarily concerned with the amount 
of storage at the verifier side and the amount of communication and want to minimize both simultaneously 
while giving good verification properties. The following definition captures these notions. (We also allow 
for the provers to collude among each other.) 

Definition 1. Let $s, c, m \ge 1$ and $0 \le r \le s$ be integers, $0 \le \rho \le 1$ be a real and $f : [q]^* \to \mathbb{R}_{\ge 0}$ be a function. Then an $(s, r)$-party verification protocol with resource bound $(c, m)$ and verification guarantee $(\rho, f)$ is a randomized protocol with the following guarantee. For any string $x \in [q]^k$, $U$ stores at most $m$ bits and communicates at most $c$ bits with the $s$ provers. At the end, the protocol outputs either a 1 or a 0. Finally, the following is true for any $T \subseteq [s]$ with $|T| \le r$: if the protocol outputs a 1 with probability at least $\rho$, then assuming that every prover $i \in [s] \setminus T$ followed the protocol and that the provers in $T$ possibly colluded with one another, we have $|y_T| \ge f(x_T)$.

We will denote a (1, 1)-party verification protocol as a one-party verification protocol. (Note that in this case, the single prover is allowed to behave arbitrarily.)

All of our protocols will have the following structure: we first pick a hash family. The protocol will pick random hash(es) and store the corresponding hash values for $x$ (along with the indices of the hash functions) during the pre-processing step. During the verification step, $U$ sends the indices of the hashes as challenges to the $s$ provers. Throughout this paper, we will assume that each prover $i$ has a computable algorithm $A_{x,i}$ such that on challenge $\beta$ it returns an answer $A_{x,i}(\beta, y_i)$ to $U$. The protocol then outputs 1 or 0 by applying a (simple) boolean function to the answers and the stored hash values.

In particular, for the one-party verification protocol, we have the following simple result, which we 
prove in the appendix. 

Proposition 2. Let $\mathcal{H} = \{h_1, \dots, h_n\}$, where $h_i : [q]^k \to [q]$, be a hash family that is $(\gamma, f)$-storage enforceable. Then there exists a one-party verification protocol with resource bound $(\log q + \log n, \log q + \log n)$ and verification guarantee $(\gamma, f)$.

2.1 Coding Basics 

We begin with some basic coding definitions. An error-correcting code $H$ with dimension $k \ge 1$ and block length $n \ge k$ over an alphabet of size $q$ is any function $H : [q]^k \to [q]^n$. A linear code $H$ is any error-correcting code that is a linear function, in which case we identify $[q]$ with $\mathbb{F}_q$. A message of a code $H$ is any element in the domain of $H$. A codeword in a code $H$ is any element in the range of $H$. The Hamming distance $\Delta(x, y)$ of two same-length strings is the number of symbols in which they differ. The relative distance $\delta$ of a code is $\min_{x \neq y} \Delta(x, y)/n$, where $x$ and $y$ are any two different codewords in the code.

The following connection between almost universal hash families and codes is well known. The straightforward proof is in the appendix.

7 We will assume that $s$ divides $n$. In our results for the case when $H$ is a linear code, we do not need the $x_i$'s to have the same size, only that $x$ can be partitioned into $x_1, \dots, x_s$. We will ignore this possibility for the rest of the paper.

Proposition 3 ([4]). Let $\mathcal{H} = \{h_1, \dots, h_n\}$ be an $\varepsilon$-almost universal hash family with $h_i : [q]^k \to [q]$. Now consider the related $q$-ary code $H$ with block length $n$ and dimension $k$, where $H(x) = (h_i(x))_{i \in [n]}$. Then $H$ has relative distance at least $1 - \varepsilon$.
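The appendix proof is not reproduced in this section; the following one-line calculation is a sketch of why the statement should hold, using only the definitions of $\varepsilon$-almost universality and Hamming distance given above. Positions where two codewords agree are exactly the indices $i$ on which the two messages collide.

```latex
\[
\Delta\bigl(H(x), H(y)\bigr)
  = n - \bigl|\{\, i \in [n] : h_i(x) = h_i(y) \,\}\bigr|
  = n \Bigl( 1 - \Pr_{i \in [n]} \bigl[ h_i(x) = h_i(y) \bigr] \Bigr)
  \ge (1 - \varepsilon)\, n
  \qquad \text{for all } x \neq y \in [q]^k .
\]
```

Dividing by $n$ and taking the minimum over all pairs $x \neq y$ gives the claimed bound on the relative distance.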

Definition 2. A $(\rho, L)$-list decodable code is any error-correcting code $H$ such that for every vector $e$ in the codomain of $H$, the number of codewords at Hamming distance $\rho n$ or less from $e$ is at most $L$.

Johnson Bound. We now state a general combinatorial result for list decoding codes with large distance, which will be useful. The result below allows for a somewhat non-standard definition of codes, where a codeword is a vector in $\prod_{i=1}^{n} [q_i]$, where the $q_i$'s can be distinct.⁸ (So far we have looked only at the case where $q_i = q$ for $i \in [n]$.) The notion of Hamming distance remains the same, i.e. the number of positions in which two vectors differ. (The syntactic definitions of the distance of a code and the list decodability of a code remain the same.) We will need the following result:

Theorem 4 ([14]). Let $C$ be a code with block length $n$ and distance $d$ where the $i$th symbol in a codeword comes from $[q_i]$. Then the code is $\left(1 - \sqrt{1 - d/n},\ 2\sum_{i=1}^{n} q_i\right)$-list decodable.

Theorem 4 and Proposition 3 imply the following:

Corollary 5. Let $\mathcal{H} = \{h_1, \dots, h_n\}$ be an $\varepsilon$-almost universal hash family with $h_i : [q]^k \to [q]$. Now consider the related $q$-ary code $H$ where $H(x) = (h_i(x))_{i \in [n]}$. Then $H$ is $(1 - \sqrt{\varepsilon}, 2qn)$-list decodable.
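The parameters in the corollary can be traced through the Johnson bound explicitly. The display below is a worked sketch, assuming only that the code has distance $d \ge (1 - \varepsilon) n$ and that every position uses the same alphabet size $q_i = q$.

```latex
\[
d \ge (1 - \varepsilon)\, n
\;\Longrightarrow\;
1 - \frac{d}{n} \le \varepsilon
\;\Longrightarrow\;
1 - \sqrt{1 - \frac{d}{n}} \;\ge\; 1 - \sqrt{\varepsilon},
\qquad\qquad
2 \sum_{i=1}^{n} q_i = 2qn .
\]
```

A code that is list decodable up to some radius is also list decodable, with the same list size, up to any smaller radius, so the radius $1 - \sqrt{\varepsilon}$ follows. For example, $\varepsilon = 1/100$ gives a list decoding radius of $0.9$, which corresponds to a prover success probability threshold of $\sqrt{\varepsilon} = 0.1$.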

For the purposes of this paper, we only consider codes $H$ that are members of a family of codes, any one of which can be indexed by $(k, n, q, \rho n)$.

Plain Kolmogorov Complexity. 

Definition 3. The plain Kolmogorov Complexity C(x) of a string x is the minimum sum of sizes of a 
compressed representation of x, along with its decompression algorithm D, and a reference universal Turing 
machine that runs the decompression algorithm. 

Because the reference universal Turing machine size is constant, it is useful to think of C(x) as simply 
measuring the amount of inherent (i.e. incompressible) information in a string x. 
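Since $C(x)$ is not computable, any practical experiment can only bound it from above. The sketch below is purely illustrative and is not part of the paper: it uses the standard `zlib` compressor as a stand-in, so the compressed length (plus the constant-size decompressor) is merely an upper bound on $C(x)$. It contrasts a block of operating-system randomness, which a general-purpose compressor should fail to shrink, with a highly repetitive block.

```python
import os
import zlib

def compressed_size(data: bytes) -> int:
    """Length in bytes of a zlib encoding of data: only an upper-bound proxy for C(x)."""
    return len(zlib.compress(data, 9))

n = 4096
random_block = os.urandom(n)            # random bytes from the operating system
repetitive_block = b"ab" * (n // 2)     # same length, but with an obvious short description

print("random:    ", compressed_size(random_block), "bytes for", n)
print("repetitive:", compressed_size(repetitive_block), "bytes for", n)
```

A small compressed size certifies low complexity, but a large one certifies nothing: a string may have a short description that `zlib` cannot find.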

3 Single Prover Storage Enforcement 

We begin by presenting our main result for the case of one prover (s = 1) to illustrate the combination of 
list decoding and Kolmogorov complexity. In the subsequent section, we will generalize our result to the 
multiple prover case (for the data possession problem). 

Theorem 6. For every computable error-correcting code $H : [q]^k \to [q]^n$ that is $(\rho, L)$-list decodable, define the hash family $\mathcal{H} = \{h_1, \dots, h_n\}$ as $h_i(x) = H(x)_i$. Then $\mathcal{H}$ is $(1 - \rho, f)$-storage enforceable, where for any $x \in [q]^k$, $f(x) = C(x) - \log(qLn^3) - 2\log\log(qn) - c_0$, for some fixed constant $c_0$.⁹

8 We're overloading the product operator $\prod$ here to mean the iterated Cartesian product.

9 The contribution from the encoding of the constants in this theorem to $c_0$ is 2. For most codes, we can take $c_0$ to be less than a few thousand. What is important is that the contribution from the encoding is independent of the rest of the constants in the theorem. Although we do not explicitly say so in the body of the theorem statements, this important fact is true for the rest of the results in this paper as well.

Proof. We assume that the prover, upon receiving $x$, saves a string $y \in [q]^*$. The prover is allowed to use any computable function to obtain $y$ from $x$. Further, we assume that the prover, upon receiving the (random) challenge $\beta \in [n]$, uses a computable function $A_x : [n] \times [q]^* \to [q]$ to compute $a = A_x(\beta, y)$.

We prove the storage enforceable property by contradiction. Assume that $|y| < f(x) \stackrel{\text{def}}{=} C(x) - \log(qLn^3) - 2\log\log(qn) - c_0$ and yet $A_x(\beta, y) = h_\beta(x) = H(x)_\beta$ with probability at least $1 - \rho$ (over the choice of $\beta$). Define $z = (A_x(\beta, y))_{\beta \in [n]}$. Note that by the claim on the probability, $\Delta(z, H(x)) \le \rho n$. We will use this and the list decodability of the code $H$ to prove that there is an algorithm with description size $< C(x)$ that describes $x$, which is a contradiction. To see this, consider the following algorithm that uses $y$ and an advice string $v \in \{0, 1\}^{|L|}$:

1. Compute a description of $H$ from $n$, $k$, $\rho n$ and $q$.

2. Compute $z = (A_x(\beta, y))_{\beta \in [n]}$.

3. By cycling through all $u \in [q]^k$, retain the set $\mathcal{L} \subseteq [q]^k$ such that for every $u \in \mathcal{L}$, $\Delta(H(u), z) \le \rho n$.

4. Output the $v$th string from $\mathcal{L}$.

Note that since $H$ is $(\rho, L)$-list decodable, there exists an advice string $v$ such that the algorithm above outputs $x$. Further, since $H$ is computable, there is an algorithm $\mathcal{E}$ that can compute a description of $H$ from $n$, $k$, $\rho n$ and $q$. (Note that using this description, we can generate any codeword $H(u)$ in Step 3.) Thus, we have a description of $x$ of size $|y| + |v| + |A_x| + |\mathcal{E}| + (3\log n + \log q + 2\log\log n + 2\log\log q + 2)$ (where the last term is for encoding the different parameters¹⁰), which means that if $|y| < C(x) - |v| - |A_x| - |\mathcal{E}| - (3\log n + \log q + 2\log\log n + 2\log\log q + 2) = f(x)$, then we have a description of $x$ of size $< C(x)$, which is a contradiction.

□ 
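The four-step algorithm used in the proof above can be sketched on a toy instance. This is an illustration under stated assumptions rather than the construction itself: the code is a tiny Reed-Solomon code with hypothetical parameters $q = n = 5$ and $k = 2$, the prover is a made-up one that stores only one symbol, and the names `encode`, `candidate_list` and `describe` are invented here. The advice $v$ is represented as an integer index into the candidate list.

```python
from itertools import product

Q, K = 5, 2          # hypothetical toy parameters: prime alphabet size and dimension
N = Q                # block length: one evaluation point per field element

def encode(u):
    """Step 1 stand-in: the code H is fixed by (N, K, Q); here a Reed-Solomon code."""
    return tuple(sum(c * pow(b, j, Q) for j, c in enumerate(u)) % Q
                 for b in range(N))

def hamming(a, b):
    """Hamming distance between two equal-length tuples."""
    return sum(s != t for s, t in zip(a, b))

def candidate_list(answer, y, rho_n):
    z = tuple(answer(beta, y) for beta in range(N))        # Step 2
    return [u for u in product(range(Q), repeat=K)         # Step 3 (brute force)
            if hamming(encode(u), z) <= rho_n]

def describe(answer, y, rho_n, v):
    return candidate_list(answer, y, rho_n)[v]             # Step 4

# Toy prover: stores only the constant coefficient and always answers with it.
x = (3, 4)
y = (x[0],)
answer = lambda beta, stored: stored[0]
rho_n = hamming(encode(x), tuple(answer(b, y) for b in range(N)))
v = candidate_list(answer, y, rho_n).index(x)   # advice that singles out x
print(describe(answer, y, rho_n, v) == x)
```

At this size the advice is as long as $x$ itself, so nothing is saved; the sketch only shows how $y$, the prover's algorithm and the index $v$ together determine $x$.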

The one unsatisfactory aspect of the result above is that if $H$ is not polynomial time computable, then Step 2 in the pre-processing step for $U$ is not efficient. Similarly, if the server is not cheating (and e.g. stores $y = x$), then it also cannot compute the correct answer efficiently. We will come back to these issues when we instantiate $H$ by an explicit code such as Reed-Solomon.

Remark 1. If $H$ has relative distance $\delta$, then note that $\mathcal{H}$ is $(1 - \delta/2 - \varepsilon, C(x) - \log(qn^3) - 2\log\log(qn) - c_0)$-storage enforceable for some fixed constant $c_0$. Further, $y$ has enough information for the prover to compute $x$ back from it. (It can use the same algorithm to compute $x$ from $y$ detailed above, except that it does not need the advice string $v$, as in Step 3 we will have $\mathcal{L} = \{x\}$.) For the more general case when $H$ is $(\rho, L)$-list decodable and $\mathcal{H}$ is $(1 - \rho, C(x) - \log(qLn^3) - 2\log\log(qn) - c_0)$-storage enforceable, $y$ has enough information for the prover to compute a list $\mathcal{L} \ni x$ with $|\mathcal{L}| \le L$. The verifier, if given access to $\mathcal{L}$, can use its randomly stored $h_\beta(x)$ to pick $x$ out of $\mathcal{L}$ with probability at least 1 - 8L.

Theorem 6 along with Proposition 2 implies the following result:

Theorem 7. For every computable error-correcting code $H : [q]^k \to [q]^n$ that is $(\rho, L)$-list decodable, there exists a one-party verification protocol with resource bound $(\log n + \log q, \log n + \log q)$ and verification guarantee $(1 - \rho, f)$, where for any $x \in [q]^k$, $f(x) = C(x) - \log(qLn^3) - 2\log\log(qn) - c_0$, for some fixed constant $c_0$.

10 We use a simple self-delimiting encoding of $q$ and $n$, followed immediately by $k$ and $\rho n$ in binary, with the remaining bits used for $v$. A simple self-delimiting encoding for a positive integer $u$ is the concatenation of: ($\lceil \log(|u|) \rceil$ in unary, 0, $|u|$ in binary, $u$ in binary). We omit the description of this encoding in later proofs.
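The self-delimiting encoding described in footnote 10 can be sketched as follows. The sketch makes one labeled assumption: the unary prefix counts the number of bits used to write $|u|$ in binary, rather than exactly $\lceil \log(|u|) \rceil$, so that the length field always fits. The function names `encode_int` and `decode_int` are hypothetical. The cost is about $\log u + 2\log\log u$ bits, which matches the shape of the $\log n + 2\log\log n$ terms in the proof of Theorem 6.

```python
def encode_int(u: int) -> str:
    """Self-delimiting code for a positive integer u, returned as a bit string.

    Layout: (number of bits of |u|, in unary) 0 (|u| in binary) (u in binary),
    where |u| is the bit length of u.
    """
    assert u >= 1
    body = bin(u)[2:]              # u in binary
    length = bin(len(body))[2:]    # |u| in binary
    return "1" * len(length) + "0" + length + body

def decode_int(bits: str, pos: int = 0):
    """Decode one integer starting at bits[pos]; return (value, next position)."""
    t = 0
    while bits[pos] == "1":        # read the unary prefix
        t += 1
        pos += 1
    pos += 1                       # skip the terminating 0
    length = int(bits[pos:pos + t], 2)
    pos += t
    value = int(bits[pos:pos + length], 2)
    return value, pos + length

# Two parameters followed by raw advice bits, as in the description of x.
stream = encode_int(257) + encode_int(1000) + "0110"
q, pos = decode_int(stream)
n, pos = decode_int(stream, pos)
advice = stream[pos:]
print(q, n, advice)
```

Because each encoded integer announces its own length, the decoder knows where the parameters end and the advice bits begin without any separator.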
4 Multiple Prover Storage Enforcement



In this section, we show how to extend the result from Theorem 7 to the multiple server case. In the first two sub-sections, we will implicitly assume the following: (i) we are primarily interested in whether some server was cheating and not in identifying the cheater(s), and (ii) all servers always reply back (possibly with an incorrect answer).

Trivial Solution. We begin with the following direct generalization of Theorem 7 to the multiple server case: essentially, run $s$ independent copies of the protocol from Theorem 7.

Theorem 8. For every computable error-correcting code $H : [q]^{k/s} \to [q]^n$ that is $(\rho, L)$-list decodable, there exists an $(s, s)$-party verification protocol with resource bound $(\log n + s\log q, s(\log n + \log q))$ and verification guarantee $(1 - \rho, f)$, where for any $x \in [q]^k$, $f(x) = C(x) - s - \log(s^2 q L^s n^4) - 2\log\log(qn) - c_0$, for some fixed positive integer $c_0$.

Multiple Parties, One Hash. One somewhat unsatisfactory aspect of Theorem 8 is that the storage needed by $U$ goes up by a factor of $s$ from that in Theorem 7. Next we show that if the code $H$ is linear (and list decodable), then we can get a guarantee similar to that of Theorem 8, except that the storage usage of $U$ remains the same as that in Theorem 7.

Theorem 9. For every computable linear error-correcting code $H : \mathbb{F}_q^k \to \mathbb{F}_q^n$ that is $(\rho, L)$-list decodable, there exists an $(s, s)$-party verification protocol with resource bound $(\log n + \log q, s(\log n + \log q))$ and verification guarantee $(1 - \rho, f)$, where for any $x \in \mathbb{F}_q^k$, $f(x) = C(x) - s - \log(s^2 qLn^4) - 2\log\log(qn) - c_0$, for some fixed positive integer $c_0$.

Catching the Cheaters and Handling Unresponsive Servers. The protocol in Theorem 8 checks whether the answer from each server is the same as the corresponding stored hash. This implies that the protocol can easily handle the case when some server does not reply back at all. Additionally, if the protocol outputs a 0, then it knows that at least one of the servers in the colluding set is cheating. (It does not necessarily identify the exact set $T$.¹¹)

However, the protocol in Theorem 9 cannot identify the cheater(s) and needs all the servers to always reply back. Next, using Reed-Solomon codes, at the cost of higher user storage and a stricter bound on the number of colluding servers, we show how to get rid of these shortcomings.

Recall that a Reed-Solomon code $RS : \mathbb{F}_q^m \to \mathbb{F}_q^\ell$ can be represented as a systematic code (i.e. the first
$m$ symbols in any codeword are exactly the corresponding message) and can correct r errors and e erasures as
long as $2r + e \le \ell - m$. Further, one can correct from r errors and e erasures in $O(\ell^3)$ time. The main idea
in the following result is to follow the protocol of Theorem 8 but, instead of storing all the s hashes, U only
stores the parity symbols in the corresponding Reed-Solomon codeword.

Theorem 10. For every computable linear error-correcting code $H : \mathbb{F}_q^k \to \mathbb{F}_q^n$ that is $(\rho, L)$ list-decodable,
assuming at most e servers will not reply back to a challenge, there exists an $(r, s)$-party verification protocol
with resource bound $(\log n + (2r + e) \cdot \log q,\ s(\log n + \log q))$ and verification guarantee $(1 - \rho, f)$, where
for any $x \in \mathbb{F}_q^k$, $f(x) = C(x) - s - \log(s^2 q L n^4) - 2\log\log(qn) - c_0$, for some fixed positive integer $c_0$.

11 We assume that identifying at least one server in the colluding set is motivation enough for servers not to collude. 
5 Corollaries



We now present specific instantiations of list decodable codes H to obtain corollaries of our main results. 

Optimal Storage Enforcement. We begin with the following observation: If the reply from a prover
comes from a domain of size q, then one cannot hope to have a $(\delta, f)$-storage enforceable hash family for
any $\delta \le 1/q$ for any non-trivial $f$. This is because the prover can always return a random value and guess
any $h_i(x)$ with probability $1/q$.

Next, we show that we can get $\delta$ to be arbitrarily close to $1/q$ while still obtaining $f(x)$ to be very close
to $C(x)$. We start off with the following result due to Zyablov and Pinsker:

Theorem 11 ([32]). Let $q \ge 2$ and let $0 < \rho < 1 - 1/q$. There exists a $(\rho, L)$-list decodable code with rate
$1 - H_q(\rho) - 1/L$.

It is known that for $\varepsilon < 1/q$, $H_q(1 - 1/q - \varepsilon) \le 1 - C_q \varepsilon^2$, where $C_q = q/(4 \ln q)$ [M Chap.
2]. This implies that there exists a code $H : \mathbb{F}_q^k \to \mathbb{F}_q^n$, with
$n \le \frac{k}{(C_q - 1)\varepsilon^2} \le 8k \ln q/(q\varepsilon^2)$, which is
$(1 - 1/q - \varepsilon, 1/\varepsilon^2)$-list decodable. Note that the above implies that one can deterministically compute a
uniquely-determined such code by iterating over all possible codes with dimension k and block length n
and outputting the lexicographically least such one that is $(1 - 1/q - \varepsilon, L)$-list decodable with the smallest
discovered value of L. Applying this to Theorem 6 implies the following optimal result:

Corollary 12. For every $\varepsilon < 1/q$ and integer $s \ge 1$, there exists a $(1/q + \varepsilon, f)$-storage enforceable hash
family, where for any $x \in [q]^k$,
$f(x) = C(x) - s - \log s^2 k^4 + \log \varepsilon^{2s+8} - \log\log q^6 k^2 + \log\log \varepsilon^4 - \log\log\log q^s - c_0$
for some fixed positive integer $c_0$.

Some of our results need H to be linear. To this end, we will need the following result due to Guruswami
et al.

Theorem 13 ([15]). Let $q \ge 2$ be a prime power and let $0 < \rho < 1 - 1/q$. Then a random linear code of
rate $1 - H_q(\rho) - \varepsilon$ is $(\rho, C_{\rho,q}/\varepsilon)$-list decodable for some term $C_{\rho,q}$ that just depends on $\rho$ and $q$.

As a corollary, the above implies (along with the arguments used earlier in this section) that there exists
a linear code $H : \mathbb{F}_q^k \to \mathbb{F}_q^n$ with $n \le 8k \ln q/(q\varepsilon^2)$ that is $(1 - 1/q - \varepsilon, C'_{\varepsilon,q}/\varepsilon^2)$-list decodable (where
$C'_{\varepsilon,q} = C_{1-1/q-\varepsilon,\,q}$). Applying this to Theorem 10 gives us the following:

Corollary 14. For every $\varepsilon < 1/q$, integer $s \ge 1$, and $r, e \le s$, assuming at most e servers do not reply
back to a challenge, there exists an $(r, s)$-party verification protocol with resource bound
$(\log k q^{2r+e-1} - \log \varepsilon^2 + \log\log q^{3/2} + 3,\ s(\log k - \log \varepsilon^2 + \log\log q^{3/2} + 3))$
and verification guarantee $(1/q + \varepsilon, f)$, where for any $x \in [q]^k$,
$f(x) = C(x) - s - \log s^2 C'_{\varepsilon,q} k^4 + \log q^3 \varepsilon^{10} - \log\log q^6 k^2 + \log\log \varepsilon^2 - \log\log\log q^2 - c_0$
for some fixed positive integer $c_0$.

Proof of Theorem 1. Corollary 5 and Theorem 6 prove Theorem 1 with $f(x) = C(x) - O(\log q + \log n)$.
12 The corresponding result for general codes has been known for more than thirty years. 
Practical Storage Enforcement. All of our results so far have used (merely) computable codes H, which
are not that useful in practice. What we really want in practice is to use codes H that lead to an efficient 
implementation of the protocol. At the very least, all the honest parties in the verification protocol should 
not have to use more than polynomial time to perform the required computation. An even more desirable 
property would be for honest parties to be able to do their computation in a one pass, logspace, data stream 
fashion. In this section, we'll see one example of each. Further, it turns out that the resulting hash functions 
are classical ones that are also used in practice. In particular, using the Karp-Rabin hash and the polynomial 
hash we get the following results. (More details are in the appendix.) 

Corollary 15. For every $\varepsilon > 0$, there exists an $(s, s)$-party verification protocol with resource bound

$((s + 1) \log k + s - \log \varepsilon^4 + s \log\log(k/\varepsilon^2),\ s \log k - s \log \varepsilon^2 + s + s \log\log(k/\varepsilon^2))$

with verification guarantee $(\varepsilon, f)$, where for every $x \in \{0, 1, \dots, \prod_{i=1}^{k} p_i - 1\}$,

$f(x) = C(x) - c_0 \left(s(\log(k/\varepsilon) - \log\log(k/\varepsilon) - 1)\right) - c_1 \log\log\log(k/\varepsilon) - c_2$

for some fixed positive integers $c_0$, $c_1$ and $c_2$. Further, all honest parties can do their computation in poly(n)
time.

Corollary 16. For every $\varepsilon > 0$,

(i) There exists an $(s, s)$-party verification protocol with resource bound
$((s + 1)(\log k + 2\log(1/\varepsilon) + 1),\ 2s(\log k + 2\log(1/\varepsilon) + 1))$ and verification guarantee $(\varepsilon, f)$,
where for any $x \in \mathbb{F}_q^k$, $f(x) = C(x) - O(s(\log k + \log(1/\varepsilon)))$.

(ii) Assuming at most e servers do not respond to challenges, there exists an $(r, s)$-party verification
protocol with resource bound $((2r + e + 1)(\log k + 2\log(1/\varepsilon) + 1),\ 2s(\log k + 2\log(1/\varepsilon) + 1))$ and
verification guarantee $(\varepsilon, f)$, where for any $x \in \mathbb{F}_q^k$, $f(x) = C(x) - O(s + \log k + \log(1/\varepsilon))$.

Further, in both the protocols, honest parties can implement their required computation with a one pass,
$O(\log k + \log(1/\varepsilon))$ space (in bits) and $O(\log k + \log(1/\varepsilon))$ update time data stream algorithm.
A Related Works



Existing approaches for data possession verification at remote storage can be broadly classified into two
categories: crypto-based and coding-based. Crypto-based approaches rely on symmetric and asymmetric
cryptographic primitives for proof of data possession. Ateniese et al. [1] defined the proof of data possession
(PDP) model, which uses public key homomorphic tags for verification of stored files. It can also support
public verifiability with a slight modification of the original protocol by adding extra communication cost.
In subsequent work, Ateniese et al. proposed a symmetric crypto-based variation (SEP) which is
computationally efficient compared to the original PDP but lacks public verifiability. Also, both of these protocols
considered the scenario with files stored on a single server, and do not discuss erasure tolerance. However,
Curtmola et al. [7] extended PDP to a multiple-server scenario by introducing multiple identical replicas of
the original data. Among other notable constructions of PDP, Gazzoni et al. [11] proposed a scheme (DDP)
that relied on an RSA-based hash (exponentiating the whole file), and Shah et al. [27] proposed a symmetric
encryption based storage audit protocol. Recent extensions of crypto-based PDP schemes by Wang et al.
(EPV) [30] and Erway et al. [10] mainly focus on supporting data dynamics in addition to existing
capabilities. Golle et al. [12] had proposed a cryptographic primitive called storage enforcing commitment (SEC)
which probabilistically guarantees that the server is using storage whose size is equal to the size of the
original data to correctly answer the data possession queries. In general, the drawbacks of the aforementioned
protocols are: (a) being computation intensive due to the usage of expensive cryptographic primitives and
(b) since each verification checks a random fragment of the data, a small fraction of data corruption might
go undetected and hence they do not guarantee the retrievability of the original data.

Coding-based approaches, on the other hand, have relied on special properties of linear codes such as the
Reed-Solomon (RS) [22] code. The key insight is that encoding the data imposes certain algebraic constraints on it which
can be used to devise an efficient fingerprinting scheme for data verification. Earlier schemes proposed
by Schwarz et al. (SFC) [25] and Goodson et al. [13] are based on this and are primarily focused on the
construction of fingerprinting functions and categorically fall under distributed protocols for file integrity
checking. Later, Juels and Kaliski [17] proposed a construction of a proof of retrievability (POR) which
guarantees that if the server passes the verification of data possession, the original data is retrievable with
high probability. While the scheme by Juels and Kaliski [17] supported a limited number of verifications, the theoretical
POR construction by Shacham and Waters [26] extended it to unlimited verification and public verifiability
by integrating cryptographic primitives. Subsequently, Dodis et al. [8] provided theoretical studies on
different variants of existing POR schemes and Bowers et al. @ considered POR protocols of practical interest
[17, 26] and showed how to tune parameters to achieve different performance goals. However, these POR
schemes only consider the single server scenario and have no construction of a retrievability and storage
enforcement guarantee in a distributed storage scenario.

B Omitted Proofs 

B.1 Proof of Proposition 2

Proof. We begin by specifying the protocol. In the preprocessing step, the verifier U does the following on
input $x \in [q]^k$:

1. Generate a random $\beta \in [n]$.

2. Store $(\beta, \alpha = h_\beta(x))$ (and in the data possession problem, send $x$ to the prover).

The prover, upon receiving $x$, saves a string $y \in [q]^*$. The prover is allowed to use any computable function
to obtain $y$ from $x$.

During the verification phase, U does the following:

1. It sends $\beta$ to the prover.

2. It receives $a \in [q]$ from the prover. ($a$ is supposed to be $h_\beta(x)$.)

3. It outputs 1 (i.e. prover did not "cheat") if $a = \alpha$, else it outputs a 0.

The resource bound follows from the specification of the protocol and the performance guarantee follows
from the fact that $\mathcal{H}$ is $(\gamma, f)$-storage enforceable. □
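The protocol above can be sketched in a few lines. This is only an illustration under stated assumptions: the hash family is instantiated with the polynomial hash over a small prime field (the field size `Q = 257` and the helper names `poly_hash`, `preprocess`, `honest_prover` and `verify` are hypothetical choices, not taken from the paper), and the prover is simply a function from the challenge to an answer.

```python
import secrets

Q = 257  # assumed prime field size; the family has n = Q hash functions


def poly_hash(x, beta, q=Q):
    """h_beta(x) = sum_i x[i] * beta^i mod q, evaluated with Horner's rule."""
    acc = 0
    for coeff in reversed(x):
        acc = (acc * beta + coeff) % q
    return acc


def preprocess(x, q=Q):
    """Verifier U: draw a random beta in [n] and store (beta, alpha = h_beta(x))."""
    beta = secrets.randbelow(q)
    return beta, poly_hash(x, beta, q)


def honest_prover(y, beta, q=Q):
    """A prover that kept the whole input (y = x) and answers truthfully."""
    return poly_hash(y, beta, q)


def verify(stored, answer):
    """Verifier U: output 1 if the answer matches the stored hash, else 0."""
    _beta, alpha = stored
    return 1 if answer == alpha else 0
```

A prover that kept only a prefix of `x` and hashes that prefix can match the stored value for at most `len(x) - 1` of the `Q` possible challenges, because two distinct polynomials of degree below `len(x)` agree on at most that many points.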

B.2 Proof of Proposition 3

Proof. This proof is straightforward. For $x \ne y \in [q]^k$, since $\mathcal{H}$ is $\varepsilon$-almost universal, there exist at most
$\varepsilon n$ values of $i \in [n]$ such that $h_i(x) = h_i(y)$, which implies that $\Delta(H(x), H(y)) \ge (1 - \varepsilon)n$, which proves
the claim. □
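As a small sanity check of this argument, the following sketch enumerates a toy hash family and measures the minimum pairwise Hamming distance of the code whose i-th symbol is $h_i(x)$. The family used here (polynomial evaluation over a prime field, with the hypothetical helper names `codeword` and `min_distance`) is an assumption made for illustration; for it, two distinct inputs collide on at most $k - 1$ indices, so $\varepsilon = (k-1)/n$.

```python
from itertools import product


def codeword(x, q, n):
    """H(x) = (h_0(x), ..., h_{n-1}(x)) with h_i(x) = sum_j x[j] * i^j mod q."""
    return [sum(c * pow(i, j, q) for j, c in enumerate(x)) % q for i in range(n)]


def min_distance(q, k, n):
    """Minimum Hamming distance over all pairs of distinct messages in [q]^k."""
    words = [codeword(x, q, n) for x in product(range(q), repeat=k)]
    best = n
    for a in range(len(words)):
        for b in range(a + 1, len(words)):
            dist = sum(u != v for u, v in zip(words[a], words[b]))
            best = min(best, dist)
    return best
```

With `q = n = 5` and `k = 2`, the family is 1/5-almost universal and the measured distance is 4, which equals $(1 - \varepsilon)n$.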

B.3 Proof of Theorem 8

Proof. We begin by specifying the protocol. In the pre-processing step, the client U does the following on
input $x \in [q]^k$:

1. Generate a random $\beta \in [n]$.

2. Store $(\beta, \gamma_1 = H(x_1)_\beta, \dots, \gamma_s = H(x_s)_\beta)$ and send $x_i$ to the server $i$ for every $i \in [s]$.

Server $i$, on receiving $x_i$, saves a string $y_i \in [q]^*$. The server is allowed to use any computable function
to obtain $y_i$ from $x_i$.

During the verification phase, U does the following:

1. It sends $\beta$ to all s servers.

2. It receives $a_i \in [q]$ from server $i$ for every $i \in [s]$. ($a_i$ is supposed to be $H(x_i)_\beta$.)

3. It outputs 1 (i.e. none of the servers "cheated") if $a_i = \gamma_i$ for every $i \in [s]$, else it outputs a 0.

Similar to the one-party result, we assume that server $i$, on receiving the challenge, uses a computable
function $A_{x,i} : [n] \times [q]^* \to [q]$ to compute $a_i = A_{x,i}(\beta, y_i)$ and sends $a_i$ back to U.

The claim on the resource usage follows immediately from the protocol specification. Next we prove
its verification guarantee. Let $T \subseteq [s]$ be the set of colluding servers. We will prove that $|y_T|$ is large by
contradiction: if not, then using the list decodability of H, we will present a description of $x_T$ of size
$< C(x_T)$. Consider the following algorithm that uses $y_T$ and an advice string $v \in (\{0, 1\}^{|L|})^{|T|}$, which is
the concatenation of shorter strings $v_i \in \{0, 1\}^{|L|}$ for each $i \in T$:

1. Compute a description of H from $n$, $k$, $\rho n$, $q$ and $s$.

2. For every $j \in T$, compute $z_j = (A_{x,j}(\beta, y_j))_{\beta \in [n]}$.

3. Do the following for every $j \in T$: by cycling through all $x_j \in [q]^{k/s}$, retain the set $\mathcal{L}_j \subseteq [q]^{k/s}$ such
that for every $u \in \mathcal{L}_j$, $\Delta(H(u), z_j) \le \rho n$.

4. For each $j \in T$, let $w_j$ be the $v_j$th string from $\mathcal{L}_j$.

5. Output the concatenation of $\{w_j\}_{j \in T}$.

Note that since H is $(\rho, L)$-list decodable, there exists an advice string $v$ such that the algorithm above
outputs $x_T$. Further, since H is computable, there is an algorithm $\mathcal{E}$ that can compute a description of H
from $n$, $k$, $\rho n$, $q$ and $s$. (Note that using this description, we can generate any codeword $H(u)$ in step 3.) Thus,
we have a description of $x_T$ of size $|y_T| + |v| + \sum_{j \in T} |A_{x,j}| + |\mathcal{E}| + (s + \log(s^2 q L^s n^4) + 2\log\log(qn) + 3)$
(where the term in parentheses is for encoding the different parameters and T), which means that if
$|y_T| < C(x_T) - |v| - \sum_{j \in T} |A_{x,j}| - |\mathcal{E}| - (s + \log(s^2 q L^s n^4) + 2\log\log(qn) + 3) = f(x)$, then we have a
description of $x_T$ of size $< C(x_T)$, which is a contradiction. □

B.4 Proof of Theorem 9

Proof. We begin by specifying the protocol. In the pre-processing step, the client U does the following on
input $x \in [q]^k$:

1. Generate a random $\beta \in [n]$.

2. Store $(\beta, \gamma = H(x)_\beta)$ and send $x_i$ to the server $i$ for every $i \in [s]$.

Server $i$, on receiving $x_i$, saves a string $y_i \in [q]^*$. The server is allowed to use any computable function
to obtain $y_i$ from $x_i$. For notational convenience, we will use $\hat{x}_i$ to denote the string $x_i$ extended to a string
in $\mathbb{F}_q^k$ by adding zeros in positions that correspond to servers other than $i$.

During the verification phase, U does the following:

1. It sends $\beta$ to all s servers.

2. It receives $a_i \in [q]$ from server $i$ for every $i \in [s]$. ($a_i$ is supposed to be $H(\hat{x}_i)_\beta$.)

3. It outputs 1 (i.e. none of the servers "cheated") if $\gamma = \sum_{i=1}^{s} a_i$, else it outputs a 0.

We assume that server $i$, on receiving the challenge, uses a computable function $A_{x,i} : [n] \times [q]^* \to [q]$
to compute $a_i = A_{x,i}(\beta, y_i)$ and sends $a_i$ back to U. The claim on the resource usage follows immediately
from the protocol specification. Next we prove its verification guarantee. Let $T \subseteq [s]$ be the set of colluding
servers. We will prove that $|y_T|$ is large by contradiction: if not, then using the list decodability of H, we will
present a description of $x_T$ of size $< C(x_T)$.

For notational convenience, define $\hat{x}_T = \sum_{j \in T} \hat{x}_j$ and $\hat{x}_{\bar{T}} = \sum_{j \notin T} \hat{x}_j$. Consider the following
algorithm that uses $y_T$ and an advice string $v \in \{0, 1\}^{|L|}$:

1. Compute a description of H from $n$, $k$, $\rho$, $q$, $s$ and $L$.

2. Compute $z = (\sum_{j \in T} A_{x,j}(\beta, y_j))_{\beta \in [n]}$.

3. By cycling through all $x \in \mathbb{F}_q^k$, retain the set $\mathcal{L} \subseteq \mathbb{F}_q^k$ such that for every $u \in \mathcal{L}$, $\Delta(H(u), z) \le \rho n$.

4. Output the $v$th string from $\mathcal{L}$.

To see the correctness of the algorithm above, note that for every $j \in [s] \setminus T$, $(A_{x,j}(\beta, y_j))_{\beta \in [n]} = H(\hat{x}_j)$.
Thus, if the protocol outputs 1 with probability at least $1 - \rho$, then $\Delta(z, H(\hat{x}_T)) \le \rho n$; here we used
the linearity of H to note that $H(\hat{x}_T) = H(x) - H(\hat{x}_{\bar{T}})$. Note that since H is $(\rho, L)$-list decodable, there
exists an advice string $v$ such that the algorithm above outputs $\hat{x}_T$ (from which we can easily compute $x_T$).
Further, since H is computable, there is an algorithm $\mathcal{E}$ that can compute a description of H from $s$, $n$, $k$, $\rho n$
and $q$. Thus, we have a description of $x_T$ of size $|y_T| + |v| + \sum_{j \in T} |A_{x,j}| + |\mathcal{E}| + (s + \log(s^2 q L n^4) +
2\log\log(qn) + 3)$ (where the term in parentheses is for encoding the different parameters and T), which
means that if $|y_T| < C(x_T) - |v| - \sum_{j \in T} |A_{x,j}| - |\mathcal{E}| - (s + \log(s^2 q L n^4) + 2\log\log(qn) + 3) = f(x)$, then we
have a description of $x_T$ of size $< C(x_T)$, which is a contradiction. □
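The role of linearity in this protocol can be seen in a short sketch. It assumes the polynomial hash over a prime field as the linear code, that s divides k, and hypothetical helper names (`poly_hash`, `zero_padded_parts`, `verify_sum`): the client keeps one hash of the whole input, each server hashes its own block padded with zeros, and the answers must sum to the stored value.

```python
Q = 257  # assumed prime field size


def poly_hash(x, beta, q=Q):
    """H(x)_beta = sum_i x[i] * beta^i mod q (a linear function of x)."""
    acc = 0
    for coeff in reversed(x):
        acc = (acc * beta + coeff) % q
    return acc


def zero_padded_parts(x, s):
    """Return the s vectors x-hat_i: block i of x in place, zeros elsewhere.

    Assumes s divides len(x).
    """
    k = len(x)
    block = k // s
    parts = []
    for i in range(s):
        padded = [0] * k
        padded[i * block:(i + 1) * block] = x[i * block:(i + 1) * block]
        parts.append(padded)
    return parts


def verify_sum(gamma, answers, q=Q):
    """Client U: output 1 if the servers' answers sum to the single stored hash."""
    return 1 if sum(answers) % q == gamma else 0
```

Because the check only sees the sum, a wrong sum reveals that some server cheated but not which one, which matches the limitation discussed for Theorem 9.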

B.5 Proof of Theorem 10

Proof. We begin by specifying the protocol. As in the proof of Theorem 9, define $\hat{x}_i$, for $i \in [s]$, to be the
string $x_i$ extended to the vector in $\mathbb{F}_q^k$, which has zeros in the positions that do not belong to server $i$. Further,
for any subset $T \subseteq [s]$, define $\hat{x}_T = \sum_{i \in T} \hat{x}_i$. Finally, let $RS : \mathbb{F}_q^s \to \mathbb{F}_q^\ell$ be a systematic Reed-Solomon
code where $\ell = 2r + e + s$.

In the pre-processing step, the client U does the following on input $x \in [q]^k$:

1. Generate a random $\beta \in [n]$.

2. Compute the vector $v = (H(\hat{x}_1)_\beta, \dots, H(\hat{x}_s)_\beta) \in \mathbb{F}_q^s$.

3. Store $(\beta, \gamma_1 = RS(v)_{s+1}, \dots, \gamma_{2r+e} = RS(v)_\ell)$ and send $x_i$ to the server $i$ for every $i \in [s]$.

Server $i$, on receiving $x_i$, saves a string $y_i \in [q]^*$. The server is allowed to use any computable function
to obtain $y_i$ from $x_i$.

During the verification phase, U does the following:

1. It sends $\beta$ to all s servers.

2. For each server $i \in [s]$, it either receives no response or receives $a_i \in \mathbb{F}_q$. ($a_i$ is supposed to be
$H(\hat{x}_i)_\beta$.)

3. It computes the received word $z \in \mathbb{F}_q^\ell$, where for $i \in [s]$, $z_i = ?$ (i.e. an erasure) if the $i$th server does
not respond, else $z_i = a_i$, and for $s < i \le \ell$, $z_i = \gamma_{i-s}$.

4. Run the decoding algorithm for RS to compute the set $T' \subseteq [s]$ to be the error locations. (Note that
by Step 2, U already knows the set E of erasures.)

We assume that server $i$, on receiving the challenge, uses a computable function $A_{x,i} : [n] \times [q]^* \to [q]$
to compute $a_i = A_{x,i}(\beta, y_i)$ and sends $a_i$ back to U (unless it decides not to respond).

The claim on the resource usage follows immediately from the protocol specification. We now prove the
verification guarantee. Let T be the set of colluding servers. We will prove that with probability at least $1 - \rho$,
U using the protocol above computes $\emptyset \ne T' \subseteq T$ (and $|y_T|$ is large enough). Fix a $\beta \in [n]$. If for this $\beta$, U
obtains $T' = \emptyset$, then this implies that for every $i \in [s]$ such that server $i$ responds, we have $a_i = H(\hat{x}_i)_\beta$.
This is because, by our choice of RS, the decoding in Step 4 will return $v$ (which in turn allows us to compute
exactly the set $T' \subseteq T$ such that for every $j \in T'$, $a_j \ne H(\hat{x}_j)_\beta$). Thus, if the protocol outputs $T' = \emptyset$
with probability at least $1 - \rho$ over the random choices of $\beta$, then using the same argument as in the proof
of Theorem 9, we note that $\Delta(H(\hat{x}_T), (\sum_{j \in T} A_{x,j}(\beta, y_j))_{\beta \in [n]}) \le \rho n$. Again, using the same argument as
in the proof of Theorem 9, this implies that $|y_T| \ge C(x_T) - s - \log(s^2 q L n^4) - 2\log\log(qn) - c_0$, for some
fixed positive integer $c_0$. □
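The idea of storing only parity symbols can be sketched as follows. This is an illustration under several assumptions, not the paper's decoder: the field is a small prime field, the systematic Reed-Solomon code uses evaluation points 0, 1, ..., and the cubic-time decoder is replaced by a brute-force search over suspect sets (helper names `lagrange_eval`, `rs_parities` and `locate_cheaters` are hypothetical). Unresponsive servers are represented by `None`.

```python
from itertools import combinations


def lagrange_eval(points, t, p):
    """Evaluate at t the unique polynomial of degree < len(points) through points, mod p."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (t - xj) % p
                den = den * (xi - xj) % p
        total = (total + yi * num * pow(den, -1, p)) % p
    return total


def rs_parities(v, ell, p):
    """Parity symbols of the systematic codeword: positions len(v), ..., ell - 1."""
    points = list(enumerate(v))
    return [lagrange_eval(points, t, p) for t in range(len(v), ell)]


def locate_cheaters(answers, parities, r, p):
    """Return the smallest set of responding servers whose removal leaves a
    consistent codeword (at most r suspects), or None if no such set exists."""
    s = len(answers)
    word = list(answers) + list(parities)
    live = [i for i in range(s) if answers[i] is not None]
    for size in range(r + 1):
        for suspects in combinations(live, size):
            keep = [(i, word[i]) for i in range(len(word))
                    if word[i] is not None and i not in suspects]
            if len(keep) < s:
                continue
            base = keep[:s]  # s points determine the candidate polynomial
            if all(lagrange_eval(base, i, p) == y for i, y in keep[s:]):
                return set(suspects)
    return None
```

For example, with `s = 4`, `r = 1`, `e = 1` and three stored parities, the answers `[10, 99, None, 40]` for true hashes `[10, 20, 30, 40]` lead the search to report server 1 as the cheater while tolerating the missing reply from server 2.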



Hashing Modulo a Random Prime. We will begin with a code that corresponds to the classical Karp-Rabin
hash [18]. It is known that the corresponding code H is the so-called Chinese Remainder Theorem (or CRT)
code. In particular, we will consider the following
special case of such codes. Let $p_1 \le p_2 \le \dots \le p_n$ be the first n primes. Consider the CRT code
$H : \prod_{i=1}^{k} [p_i] \to \prod_{i=1}^{n} [p_i]$, where the message $x \in \{0, 1, \dots, (\prod_{i=1}^{k} p_i) - 1\}$ is mapped to the vector
$(x \bmod p_1, x \bmod p_2, \dots, x \bmod p_n) \in \prod_{i=1}^{n} [p_i]$. It is known that such codes have distance $n - k + 1$
(cf. [14]). By a simple upper bound on the prime counting function (cf. Q), we can take $p_n \le 2n \log n$.
Moreover, $\sum_{i=1}^{n} p_i < n p_n / 2$ (cf. [23]). Thus, if we pick a CRT code with $n = k/\varepsilon^2$, then by Theorem 4,
H is $(1 - \varepsilon, k^2(\log k - \log \varepsilon^2)/\varepsilon^4)$-list decodable. Given any $x \in \{0, 1, \dots, (\prod_{i=1}^{k} p_i) - 1\}$ and a random
$\beta \in [n]$, $H(x)_\beta$ corresponds to the Karp-Rabin fingerprint (modding the input integer with a random prime).
Further, $H(x)_\beta$ can be computed in polynomial time. Thus, letting H be the CRT code in Theorem 8, we
get the following:

Corollary 17. For every $\varepsilon > 0$, there exists an $(s, s)$-party verification protocol with resource bound

$((s + 1) \log k + s - \log \varepsilon^4 + s \log\log(k/\varepsilon^2),\ s \log k - s \log \varepsilon^2 + s + s \log\log(k/\varepsilon^2))$

with verification guarantee $(\varepsilon, f)$, where for every $x \in \{0, 1, \dots, \prod_{i=1}^{k} p_i - 1\}$,

$f(x) = C(x) - c_0 \left(s(\log(k/\varepsilon) - \log\log(k/\varepsilon) - 1)\right) - c_1 \log\log\log(k/\varepsilon) - c_2$

for some fixed positive integers $c_0$, $c_1$ and $c_2$. Further, all honest parties can do their computation in poly(n)
time.
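The CRT code and the corresponding Karp-Rabin fingerprint can be sketched directly from the definition above. The helper names (`first_primes`, `crt_codeword`, `karp_rabin`) are hypothetical, and the trial-division prime generator is only meant for toy parameters.

```python
def first_primes(n):
    """Return the first n primes by trial division (fine for small n)."""
    primes = []
    cand = 2
    while len(primes) < n:
        if all(cand % p for p in primes if p * p <= cand):
            primes.append(cand)
        cand += 1
    return primes


def crt_codeword(x, n):
    """H(x) = (x mod p_1, ..., x mod p_n) for the first n primes."""
    return [x % p for p in first_primes(n)]


def karp_rabin(x, beta, n):
    """H(x)_beta: the input integer reduced modulo the beta-th prime (0-indexed)."""
    return x % first_primes(n)[beta]
```

With `k = 2` the messages are the integers below `2 * 3 = 6`, and with `n = 5` any two distinct messages agree in at most `k - 1 = 1` coordinate, which is the distance `n - k + 1` stated above.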

Remark 2. Theorem 10 can be extended to handle the case where the symbols in codewords of H are of
different sizes. However, for the sake of clarity we refrain from applying CRT to the generalization of
Theorem 10. Further, the results in the next subsection allow for a more efficient implementation of the
computation required from the honest parties.

Reed-Solomon Codes. Finally, we take $H : \mathbb{F}_q^k \to \mathbb{F}_q^n$ to be the Reed-Solomon code, with $n = q$.
Recall that for such a code, given message $x = (x_0, \dots, x_{k-1}) \in \mathbb{F}_q^k$, the codeword is given by
$H(x) = (P_x(\beta))_{\beta \in \mathbb{F}_q}$, where $P_x(Y) = \sum_{i=0}^{k-1} x_i Y^i$. It is well-known that such a code H has distance $n - k + 1$.
Thus, if we pick $n = k/\varepsilon^2$, then by Theorem 4, H is $(1 - \varepsilon, 2k^2/\varepsilon^4)$-list decodable.
Given any $x \in \mathbb{F}_q^k$ and a random $\beta \in [n]$, $H(x)_\beta$ corresponds
to the widely used "polynomial" hash. Further, $H(x)_\beta$ can be computed in one pass over $x$ with storage
of only a constant number of $\mathbb{F}_q$ elements. (Further, after reading each entry in $x$, by Horner's rule, the
algorithm just needs to perform one addition and one multiplication over $\mathbb{F}_q$.) Thus, applying H as the
Reed-Solomon code to Theorems 8 and 10 implies the following:

Corollary 18. For every $\varepsilon > 0$,

(i) There exists an $(s, s)$-party verification protocol with resource bound
$((s + 1)(\log k + 2\log(1/\varepsilon) + 1),\ 2s(\log k + 2\log(1/\varepsilon) + 1))$ and verification guarantee $(\varepsilon, f)$,
where for any $x \in \mathbb{F}_q^k$, $f(x) = C(x) - O(s(\log k + \log(1/\varepsilon)))$.

(ii) Assuming at most e servers do not respond to challenges, there exists an $(r, s)$-party verification
protocol with resource bound $((2r + e + 1)(\log k + 2\log(1/\varepsilon) + 1),\ 2s(\log k + 2\log(1/\varepsilon) + 1))$ and
verification guarantee $(\varepsilon, f)$, where for any $x \in \mathbb{F}_q^k$, $f(x) = C(x) - O(s + \log k + \log(1/\varepsilon))$.

Further, in both the protocols, honest parties can implement their required computation with a one pass,
$O(\log k + \log(1/\varepsilon))$ space (in bits) and $O(\log k + \log(1/\varepsilon))$ update time data stream algorithm.

13 We will assume that $T \cap E = \emptyset$. If not, just replace T by $T \setminus E$.
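The one-pass computation via Horner's rule can be sketched as a small streaming class. The names `StreamingPolyHash` and `hash_stream` are hypothetical, and one assumption should be stated plainly: to use a single multiplication and a single addition per symbol, the stream is consumed from the highest-index coefficient $x_{k-1}$ down to $x_0$ (equivalently, the hash is taken with the coefficient order reversed).

```python
class StreamingPolyHash:
    """Maintains P_x(beta) mod q with O(1) field elements of state."""

    def __init__(self, beta, q):
        self.beta = beta % q
        self.q = q
        self.acc = 0  # the only state kept between symbols

    def update(self, symbol):
        # Horner step: one multiplication and one addition over F_q.
        self.acc = (self.acc * self.beta + symbol) % self.q

    def digest(self):
        return self.acc


def hash_stream(symbols_high_to_low, beta, q):
    """Feed coefficients x_{k-1}, ..., x_0 and return sum_i x_i * beta^i mod q."""
    h = StreamingPolyHash(beta, q)
    for symbol in symbols_high_to_low:
        h.update(symbol)
    return h.digest()
```

For `x = [3, 1, 4, 1, 5]`, `beta = 2` and `q = 257`, feeding `reversed(x)` yields the same value as evaluating $\sum_i x_i \beta^i$ directly.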